ABSTRACT

An effective response to DNA damaging agents 
involves modulating numerous facets of cellular 
homeostasis in addition to DNA repair and cell-cycle 
checkpoint pathways. Fluorescence microscopy- 
based imaging offers the opportunity to simultan- 
eously interrogate changes in both protein level and 
subcellular localization in response to DNA damaging 
agents at the single-cell level. We report here results 
from screening the yeast Green Fluorescent Protein 
(GFP)-fusion library to investigate global cellular 
protein reorganization on exposure to the alkylating 
agent methyl methanesulfonate (MMS). Broad groups 
of induced, repressed, nucleus- and cytoplasm- 
enriched proteins were identified. Gene Ontology 
and interactome analyses revealed the underlying 
cellular processes. Transcription factor (TF) analysis 
identified principal regulators of the response, and 
targets of all major stress-responsive TFs were
enriched amongst the induced proteins. An unex- 
pected partitioning of biological function according 
to the number of TFs targeting individual genes was 
revealed. Finally, differential modulation of ribosomal 
proteins depending on methyl methanesulfonate dose 
was shown to correlate with cell growth and with the 
translocation of the Sfp1 TF. We conclude that cellular 
responses can navigate different routes according to 
the extent of damage, relying on both expression and 
localization changes of specific proteins. 

INTRODUCTION 

Several DNA repair and cell-cycle checkpoint pathways 
have evolved to cope with damage to the genome that 
can arise from endogenous and exogenous sources (1,2). 



It is well established that effective cellular responses to 
DNA damaging agents involve not only modulation of 
canonical DNA repair and cell-cycle regulation proteins 
but also modulation of a large number of seemingly un- 
related cellular processes (3). Previous studies identified a 
number of these pathways using 'transcriptional profiling' 
and 'genomic phenotyping' in Saccharomyces cerevisiae 
(4-9). Transcriptional profiling quantified global changes 
in mRNA levels in response to genotoxic stress using 
microarrays, whereas genomic phenotyping investigated 
the sensitivity to DNA damaging agents for almost 6000 
S. cerevisiae strains, each deficient in a single gene 
product. A consensus emerged from these and other 
studies (9,10) that a general shutdown of protein synthesis 
occurs under conditions of DNA damage because the 
transcription of ribosomal protein (RP) genes is repressed 
under conditions of genotoxic stress. Later studies 
demonstrated a concomitant preferential translation of 
specific damage-responsive proteins under conditions 
of genotoxic stress (11-13), arguing for translational 
regulation playing a role in the cellular response to 
DNA damaging agents. Finally, a number of post-translational
protein modifications are known to orchestrate the DNA damage
response, including phosphorylation, ubiquitylation and
sumoylation (1,14).

To probe response pathways at the single-cell level, we 
developed a quantitative high-throughput fluorescence 
imaging approach to assess not only changes in protein 
levels but also changes in nuclear versus cytoplasmic lo- 
calization in response to the DNA damaging agent methyl 
methanesulfonate (MMS). This assay was performed for 
>4000 S. cerevisiae strains expressing individual GFP- 
tagged fusion-proteins, representing nearly 70% of the 
yeast proteome (15). Importantly, fusion proteins in this 
library are expressed from their native promoters, thereby 
closely reflecting the response of their corresponding genes 
on MMS exposure. 
Previously, this library was used to study genome-wide
protein localization using fluorescence microscopy (15) 
and overall expression levels using flow cytometry (16). 
Here, we use the library to identify proteins whose expres- 
sion level is induced or repressed in response to MMS, as 
well as proteins that concentrate in either the nuclear or 
cytoplasmic compartment in response to MMS. Analysis 
of Gene Ontology (GO) and protein-protein interaction 
networks reveals damage-induced changes in levels and/or
localizations for proteins and complexes involved in chro- 
matin remodeling, mRNA processing, RNA polymerase 
II transcription, proteolysis, ribosome biogenesis, metab- 
olism, lipid synthesis, plus a number of other pathways, in 
addition to canonical DNA repair and cell-cycle regula- 
tion. Further, we characterize the transcription factor 
(TF) networks linked to changes in protein abundance, 
revealing a differential regulation of metabolic versus 
DNA-related processes. Finally, we further investigate 
the unexpected induction response of RPs, finding a dif- 
ferential response depending on the extent of damage, 
cellular growth rate and nuclear-to-cytoplasmic translocation
of the TF Sfp1 that targets the RP genes (17).

MATERIALS AND METHODS 

Cell growth and culture conditions 

We used the budding yeast GFP fusion library developed 
by Huh and colleagues (15). The haploid parent yeast strain 
was ATCC 201388: MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0.
This strain is called wild-type (WT) throughout the manu- 
script. Cells were cultured in minimal SD medium (MP 
Biomedicals) supplemented with amino acids His, Leu, 
Ura and Met. Cells were grown at 30°C. Cells were 
cultured to stationary phase for 3 days and then diluted 
in fresh medium and allowed to grow overnight in triplicate 
cultures. Log-phase cultures were incubated in growth 
medium with or without 0.02% MMS for 3h. Details of 
plate preparation for High Content Imaging are presented 
in the Supplementary Information. For the initial purposes 
of identifying a better fixation method, either one of two 
protocols adapted from previous studies was followed 
(18,19) (Supplementary Information, Supplementary 
Figures S1-S7). For all subsequent experiments, the 
second fixation method (called Fix 2) was followed, as it 
caused less loss of GFP fluorescence and also did not
introduce any additional autofluorescence (Supplementary 
Figures S3-S5). Plates for the screen were prepared in triplicate
on a Tecan liquid handling robot (Männedorf,
Switzerland) running on EVOware software. 

Cell processing for flow cytometry and fluorescence 
microscopy 

Cells were treated with 1 µg/ml 4′,6-diamidino-2-phenylindole
(DAPI) and 2.5 µg/ml Concanavalin A-Alexa647
(both from Invitrogen Life Technologies) for 30min to 
stain DNA and the cell-wall, respectively. Cells used for 
flow cytometry were resuspended in phosphate buffered 
saline, whereas cells used for imaging were mounted on 
concanavalin A-coated 96-well plates to allow adhesion of 
the cells onto the bottom of the well. Excess cells were 



washed off and the remaining cells were mounted in 
30% glycerol in phosphate buffered saline. Samples were 
prepared in triplicate for both control untreated and 
MMS-treated samples. 

Flow cytometry and fluorescence microscopy 

Flow cytometry was performed on an Accuri C6 Flow 
Cytometer (Accuri Cytometers Inc., Ann Arbor, MI), 
unless otherwise mentioned. Imaging was performed on 
a Cellomics Arrayscan VTi (Thermo Fisher Scientific, 
Pittsburgh, PA), using the XF93 filter to image GFP, 
DAPI and Alexa 647 fluorescence. A 40× 0.75 numerical
aperture (N.A.) air objective was used for imaging. High-resolution
imaging was performed on an Observer Z1
microscope (Carl Zeiss, Jena, Germany) with a 100× 1.4
N.A. oil immersion objective. 

Image analysis 

Image analysis was performed in MATLAB
(MathWorks Inc., Natick, MA) with custom-written
routines that detect cellular boundaries from
images of the cell wall stained with Alexa 647-conjugated
Concanavalin A. Clusters of cells were eliminated from
calculations because nuclear versus cytoplasmic localizations
could not be correctly computed for them. DAPI-stained
images of DNA were used to compute nuclear
positions by thresholding, after cells that showed uniform
DAPI staining were eliminated by setting cutoff conditions
of intensity and area. The nuclear masks
thus obtained were applied to the GFP image to
compute GFP fluorescence levels in the nucleus. The rest
of the cell was treated as cytoplasm, enabling the compu- 
tation of nuclear to cytoplasmic ratios. In our approach, 
the mean GFP intensities are evaluated in the whole cell, 
nuclear and cytoplasmic masks. This is distinct from a 
previous work that used an expanded nuclear mask to 
evaluate cytoplasmic fluorescence (20). All raw images, 
data analysis programs and single cell level files associated 
with this study can be accessed at: http://yeastgfpscreen.mit.edu/.
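
The mask-based measurement described above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code: the function names, the nested-list image representation and the toy 4 x 4 image are assumptions made purely for illustration, and segmentation itself (thresholding, cluster removal) is taken as already done.

```python
def mean_in_mask(image, mask):
    """Mean pixel value of `image` over the pixels where `mask` is True."""
    values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


def nuclear_cytoplasmic_ratio(gfp, cell_mask, nucleus_mask):
    """Return (whole-cell mean GFP, NCR) for one segmented cell.

    The cytoplasm is taken as the cell mask minus the nuclear mask,
    i.e. no expanded nuclear ring is used.
    """
    nucleus = [
        [c and n for c, n in zip(cell_row, nuc_row)]
        for cell_row, nuc_row in zip(cell_mask, nucleus_mask)
    ]
    cytoplasm = [
        [c and not n for c, n in zip(cell_row, nuc_row)]
        for cell_row, nuc_row in zip(cell_mask, nucleus_mask)
    ]
    whole_cell_mean = mean_in_mask(gfp, cell_mask)
    ncr = mean_in_mask(gfp, nucleus) / mean_in_mask(gfp, cytoplasm)
    return whole_cell_mean, ncr


# Hypothetical toy image: one cell of 2 x 2 pixels with a 1-pixel nucleus.
gfp_image = [
    [0, 0, 0, 0],
    [0, 10, 30, 0],
    [0, 10, 10, 0],
    [0, 0, 0, 0],
]
cell = [[0 < r < 3 and 0 < c < 3 for c in range(4)] for r in range(4)]
nucleus = [[r == 1 and c == 2 for c in range(4)] for r in range(4)]

level, ncr = nuclear_cytoplasmic_ratio(gfp_image, cell, nucleus)
```

In this toy case the whole-cell mean is 15 and the NCR is 3, because the single nuclear pixel is three times brighter than the cytoplasmic pixels.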

Data analysis 

Experiments were performed in triplicate and the mean of 
each replicate was calculated. The WT control was used to 
estimate the autofluorescence level, which was subtracted 
from all the measured strains in that experiment. WT 
controls were present in every plate. We found that for 
many low-expressing strains when the measured intensity 
is close to autofluorescence, there can be spuriously high 
estimates of fold-changes. Such strains with fluorescence 
levels close to autofluorescence in the control samples 
were retained in the analysis because the expression of a 
protein may be turned on by MMS treatment. To avoid 
spuriously high fold-changes, we added a constant value, c 
(equal to the width of the autofluorescence histogram), to 
every measured value. Although this leads us to underesti- 
mate the fold-changes, this ensures that only substantial 
changes are scored as true responders. Thus, fold-change 
for an experimental strain is calculated as $f = [I_{\mathrm{MMS}} - (I_{\mathrm{WT}} - c)] / [I_{\mathrm{Control}} - (I_{\mathrm{WT}} - c)]$, where $I$ denotes mean
intensity from triplicate samples, and the subscripts
denote the conditions and strains. The autofluorescence 
estimated from the WT strain [as also done in a 
previous study (21)] is only a ballpark value, and there 
can be a strain-to-strain variability of true autofluo- 
rescence (16). This approach ensures that changes in 
expression due solely to differences in autofluorescence 
are not scored as responders. However, because 
experiments are performed in triplicate, statistical 
confidence can be attributed even to relatively small 
changes when they pass the thresholds used. The individual
means, obtained from the three experiments,
were compared by Student's t-tests at P < 0.05 and
f > 1.5 (for induced proteins) or f < 0.75 (for repressed
proteins). For nuclear-to-cytoplasmic ratio (NCR)
changes by t-tests, P < 0.05 and $|\Delta\mathrm{NCR}| > 0.1$ were
used. $\Delta\mathrm{NCR} = \mathrm{NCR}_{\mathrm{Control}} - \mathrm{NCR}_{\mathrm{MMS}}$, and a negative
sign indicates nuclear enrichment, whereas a positive
sign indicates cytoplasmic enrichment. Typically
hundreds to thousands of cells were measured for each 
strain. If for any particular well the cell count was <50, 
that well was eliminated from all calculations. 
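
As a worked illustration of the abundance criteria above, the following Python sketch computes the offset-corrected fold change and applies the two thresholds. It is a sketch under stated assumptions, not the authors' pipeline: the text does not say which variant of Student's t-test was used, so a pooled-variance two-sample test on three replicate means per condition is assumed here, with the standard two-sided 5% critical value for 4 degrees of freedom (2.776). All function names and intensity values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 5% critical value of Student's t for 4 degrees of freedom
# (three replicates per condition, pooled variance: an assumption).
T_CRIT_DF4 = 2.776


def fold_change(i_mms, i_control, i_wt, c):
    """f = [I_MMS - (I_WT - c)] / [I_Control - (I_WT - c)]."""
    offset = i_wt - c
    return (i_mms - offset) / (i_control - offset)


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two triplicates."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("this sketch assumes exactly three replicates")
    pooled_variance = (stdev(a) ** 2 + stdev(b) ** 2) / 2
    return (mean(a) - mean(b)) / sqrt(pooled_variance * 2 / 3)


def classify(mms, control, i_wt, c):
    """Label a strain 'induced', 'repressed' or 'no call'."""
    f = fold_change(mean(mms), mean(control), i_wt, c)
    significant = abs(t_statistic(mms, control)) > T_CRIT_DF4
    if significant and f > 1.5:
        return "induced"
    if significant and f < 0.75:
        return "repressed"
    return "no call"


# Hypothetical replicate means (arbitrary intensity units).
call = classify(mms=[400, 420, 410], control=[200, 210, 190], i_wt=100, c=20)
```

With these made-up numbers the fold change is 330/120 = 2.75 and the difference is significant, so the strain is called induced. Adding the constant c shrinks fold changes for strains whose control intensity is near autofluorescence, which is the conservative behavior the text describes.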

Protein intensity distributions over cells are often non-Gaussian.
However, the means arising from repeated
measurements of such distributions are approximately normally
distributed, and thus Student's t-test can be used for
comparing the means obtained from such measurements.
This assumption of normality does not hold true when 
comparing the underlying distributions of protein inten- 
sities over cells, in which case a non-parametric test is 
preferable. We used the Kolmogorov-Smirnov (KS) test 
for determining responders in terms of levels and localization.
A strain was considered to be a responder in terms of
abundance if the KS statistic was > 0.3 for at least two of the three
replicates and f > 1.5 (for induced proteins) or f < 0.75 (for
repressed proteins). A KS statistic > 0.3 for at least two of the
three replicates was also used for NCR changes. For KS
tests, replicates were compared pairwise. For protein levels,
no autofluorescence (AF) correction is performed because it is not
possible to determine AF for each cell. We also kept track of fold changes
and subcellular localization as scored by Huh et al. (15) so 
that true translocations and NCR changes due to abun- 
dance changes can be distinguished. 
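
The KS criterion can be sketched in a few lines of Python. This is an illustrative implementation of the standard two-sample KS statistic (the maximum distance between two empirical cumulative distributions), not the authors' code; the function names are hypothetical, and how replicates are paired is left to the caller because the text only states that they were compared pairwise.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    distance = 0.0
    # The supremum is attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


def passes_ks_cutoff(ks_values, cutoff=0.3, min_replicates=2):
    """True if at least `min_replicates` replicate pairs exceed `cutoff`."""
    return sum(k > cutoff for k in ks_values) >= min_replicates


# Hypothetical single-cell intensities for one control/MMS replicate pair.
d = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

Here `d` is 0.5, so this replicate pair would exceed the 0.3 cutoff; a strain is scored only if at least two of its three replicate pairs do so.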

The Cytoscape program was used for all network 
analyses (22,23). GO analyses were performed using the 
ClueGO plugin in Cytoscape (24). The YEASTRACT 
website was used for TF analyses (http://www.yeastract. 
com/) (25,26). Further details of statistical methods used 
can be found in the Supplementary Information File. 

RESULTS 

Sample preparation and analyses 

We set out to monitor expression and localization changes 
for 4159 GFP-tagged proteins in the same number of 
S. cerevisiae strains after 3h of exposure to 0.02% 
MMS, a relatively non-toxic dose (4,8). In the parental 
strain, cells showed an expected S-phase arrest and were 
largely viable at this time point, although the culture 
showed a ~40% decrease in colony forming ability 



(Supplementary Figures S8-S10). To address a specific
timepoint across large numbers of samples using multi- 
spectral fluorescence imaging of distinct cellular compart- 
ments, we developed a fixation strategy to minimize the 
typical attenuation of GFP signal resulting from standard 
yeast fixation protocols. This was particularly important 
to detect changes in expression levels for low copy-number 
proteins that are otherwise obscured by cellular autofluor- 
escence. In a previous study of the 4159 strains expressing 
the GFP-tagged proteins, only 2700 proteins could be 
reliably detected using live-cell flow cytometry (16), and 
of these, the vast majority (85%) of GFP levels resided at 
the low-end of the expression range (16) (Supplementary
Figure S1). We therefore developed an optimized
aldehyde-based fixation protocol that minimized intensity 
loss and, importantly, exhibited a linear relation between 
fixed and live cell intensities across the entire range of 
GFP expression so that relative changes upon MMS 
exposure could be accurately measured (Supplementary 
Figures S2-S7). Strains were grown in minimal media in 
triplicate using a robotic liquid handler (see 'Materials and 
Methods' section for details). The importance of replicate 
measurements for generating greater statistical confidence 
in results from high-throughput screens has previously 
been emphasized (27), especially for detecting small but 
biologically significant responses. 

Early log-phase cultures were incubated in control 
medium or medium containing 0.02% MMS for 3h. 
Cells were then fixed, and nuclei and cell walls were
stained (Figure 1A). Custom-written image analysis
programs were used to quantify GFP fluorescence in segmented
cytoplasmic and nuclear compartments (Figure
1B, 'Materials and Methods' section). The computed
intensities from fixed cells in this work compared well 
with previous studies on live cells (Supplementary 
Figures S11 and S12). Finally, statistical analyses were
performed to identify 'responders' defined as strains for 
which protein levels were significantly induced or re- 
pressed, as well as strains for which proteins were 
enriched in the nucleus or cytoplasm as quantified by 
their NCR. Significant differences in mean expression or 
NCR were determined using a Student's t-test with a
significance threshold of P < 0.05 (Figure 1C). However,
because the comparison of mean values alone does not 
reveal differences in the underlying protein distributions 
themselves, the KS test was also employed (28-31). The 
KS statistic has been shown to be significantly more 
sensitive in detecting differences in population responses 
compared with the comparison of mean responses (29). 
The KS statistic represents the maximum distance 
between two cumulative histograms, where values equal 
to or greater than 0.2 are suggested to denote biological 
significance in the context of high-content screening (HCS) (28,29). In addition to
t-tests comparing mean responses over replicates, we used
the KS statistic to identify responders in terms of abun- 
dance or NCR changes (Figure 1C and D; 'Materials and 
Methods' section). 'Hit-lists' for induced and repressed 
proteins, as well as nucleus- or cytoplasm-enriched 
proteins, are provided in Supplementary Table S1 as
measured by both t-tests and KS tests, with a summary
of some major responders presented in Table 1. Also listed 
Figure 1. Overview of sample preparation and analysis. Plates are prepared in triplicate for control and MMS-treated samples (0.02%, 3 h). (A)
Samples are fixed and stained with DAPI to mark nuclei and Alexa-647-conjugated Concanavalin A to mark cell walls. Automated imaging is
performed on a Cellomics HCS™ fluorescence microscope. A typical raw image is shown for the Rnr4-GFP strain. Scale bar, 5 µm. (B) Schematic of
image analysis protocol. Raw cell-wall and DNA fluorescence images are used to segment cellular and nuclear boundaries. Mean GFP pixel intensity
is evaluated in the full cellular image mask to determine relative protein abundance. The ratio of intensities between the nuclear and cytoplasmic
masks is used to determine the NCR on a cell-by-cell basis. (C) Mean response of Rnr4-GFP to DNA damage in terms of both levels and
localization. Protein is induced as measured by autofluorescence (AF)-corrected levels and the NCR decreases significantly indicating that the
Rnr4-GFP translocates to the cytoplasm on damage. Error bars are standard deviations of the means from the three replicates. P-values are
determined using a Student's t-test. (D) Mean response over cells does not account for the distinct distributions over cell populations. Thus, the
same sample as in (C) is evaluated using the KS statistic. The KS statistic is a measure of the maximum distance between the normalized cumulative 
distribution histograms (indicated by the black double-headed arrows). The KS statistic is independently evaluated for each sample pair. Again, a 
significantly higher expression and cytoplasmic translocation of Rnr4-GFP is seen with DNA damage by MMS. In all instances, a KS-statistic cutoff 
of 0.3 is used. At least 200 cells are measured for each curve here. 



is the subcellular localization of each protein, as previ- 
ously determined (15), and its function as listed in the 
Saccharomyces Genome Database. Overall, 415 induced 
proteins, 174 repressed proteins, 133 nuclear-enriched pro- 
teins and 10 cytoplasm-enriched proteins were identified. 
Supplementary Table S2 documents intensity and NCR 
measurements for every strain measured. 

Induced proteins and their interaction networks represent 
a wide range of cellular processes including DNA repair, 
proteolysis, chromatin remodeling and ribosome biogenesis 

GO functional enrichment analysis for the 415 induced 
proteins using the ClueGO plugin in Cytoscape (22-24) 
revealed an expected enrichment of GO-terms related to 
DNA damage response, DNA repair and cellular stress, as 
well as other GO-terms related to cellular metabolism, 
protein degradation, chromatin remodeling, RNA poly- 
merase II transcription, mRNA processing and ribosome 
biogenesis (Supplementary Table S3 and Figure 2A). 
Although the absolute number of genes associated with 
some GO terms can be low, they may still be scored as 
significant when they represent a large fraction of the total 



genes associated with that term. For example, 8/14 (57%) 
proteins associated with the ubiquitin-independent 
proteasomal machinery, and 5/7 (71%) proteins 
associated with trehalose metabolism are induced by 
MMS treatment (Figure 2A). The first indication that 
proteasome function is enhanced and that trehalose metabolism
is affected by MMS exposure came from
transcriptional profiling studies (6,7). Subsequent work 
has now shown that proteasome-mediated responses are 
directly involved in DNA repair (32), and that trehalose 
protects cells against different DNA damaging agents 
(33-35). Thus, although these and the other GO- 
enriched processes may initially seem disparate, taken 
together they represent a coordinated response to 
cellular insult by MMS. 

Mapping responders onto a previously compiled yeast 
interactome (36) enabled identification of functional 
networks of induced proteins. The full interactome anno- 
tates physical, genetic and TF interactions. The smaller 
network that resulted from mapping MMS-induced 
proteins onto the full interactome had higher connectivity, 
higher clustering coefficient, and a greater 'Large 
Connected Component' (LCC) than random networks of 
Table 1. Some major responders in all investigated categories

Induced proteins
Hug1, Rnr3, Rps11a, Trf4, Tps1, Ssa4, Rnr4, Snu71, YGR219W, Hsp31, Hsp12, Yhb1, Oye2, Hxk1, Pdc5, Hsp26, Lap4, YHR087W, Cep3, Ddr48
Hug1, Rnr3, Rps11a, Trf4, Tps1, Ssa4, Faf1, Rnr4, Snu71, Hsp31, Hsp12, Yhb1, Oye2, Hxk1, Hsp26, Lap4, Atc1, YHR087W, Cep3, Ddr48

Repressed proteins
YDL089W, Erg5, Pho3, Far1, Icl2, Fmp48, YKR077W, YOL047C, Cyc2, Ade5,7, Ecm2, Fmp33, YMR114C, Cak1, Ast1, Ymc2, YDR065W, Tna1, YMR166C, Pex27
YBR235W, Erg11, YDL089W, Erg5, Pho3, Far1, Icl2, Fmp48, YOL047C, Cyc2, YKR077W, Ade5,7, Ecm2, Cak1, Tna1, YMR166C, Ast1, YDR065W, Aah1, Pex27

Nucleus-enriched proteins
Rfa2, Htb1, Tkl1, Rfa3, Npl3, Pob3, Rsc9, Nhp6a, Bdf1, Nell, Rsc8, Nop10, Top2, Puf6, Taf14, Aro4, Rpb7, Sth1, Pds5, Acs2
Nab2, Rfa2, Rsc9, Top2, Rfa3, Pob3, Pop5, Pds5, Rpb7, Taf6, Ies6, Snu71, Npl6, Rlr1, Sth1, Taf14, YLR108C, Sfh1, Spt7, Abf1

Cytoplasm-enriched proteins
Rpt5, Wtm1, Rnr4
Tdh3, Wtm1, Rnr4, Wtm2, Nup2, Aah1, Gsy2, Rpt5, YLR003C, YDR357C

The top 20 responders (where applicable) in the induced, repressed, nucleus- and cytoplasm-enriched categories are shown. In the blue rows
are significant responders by t-tests, and in the pink rows are the significant responders by KS tests. For the induced and repressed
categories, the responders are arranged according to maximal fold change. For the nucleus- or cytoplasm-enriched categories, responders are
arranged by maximal NCR change for t-tests and by maximal KS statistic for KS tests. For the complete list of responders, see Supplementary
Table S1.



similar size (Supplementary Figure S13). These general
features were previously observed in network analyses of 
toxicity modulating genes in genomic phenotyping studies 
(37). Although the protein-protein and genetic interaction 
maps revealed a number of clusters that were expected 
from the GO functional enrichment analysis (namely, 
DNA damage response, ribosome biogenesis, chromatin 
remodeling, RNAPII transcription and proteolysis), such 
mapping also highlighted additional functions that were 
not represented in the GO analysis (Figure 2B). For 
example, all four proteins known to be associated with 
the cohesin complex (38) are induced. This group was 
not represented in the GO analysis because the number 
of proteins in the cohesin complex is so small that it falls 
below the threshold used for filtering GO terms. Previous
studies have also demonstrated a role for the
cohesin complex in DNA damage response beyond its 
conventional role in sister-chromatid cohesion (39-41). 
We also identified a distinct cluster of heat shock 
proteins (Figure 2B), consistent with recent studies that 
demonstrate links between heat shock and DNA damage 
responses (42,43). Finally, a group of proteins involved in 
mRNA processing was present among the subnetworks of 
induced proteins. Interestingly, a recent genomic 
phenotyping study involving both essential and non-essen- 
tial genes also highlighted the importance of mRNA pro- 
cessing and splicing proteins in governing sensitivity to the 
toxic effects of MMS (8). MMS also induces damage to 
proteins, RNA and other cellular components, in addition 
to DNA. The modulation of a large number of RNA pro- 
cessing and ribosomal genes may also be signaled from 
protein and RNA damage in conjunction with the direct 
DNA damage response. 

Partitioning of biological function among induced proteins 
according to number of upstream TFs 

In addition to physical and genetic interactions, we 
examined TF interactions by mapping induced proteins 
onto the global TF network constructed from the 



YEASTRACT database (25,26,36), thereby identifying 
eight TFs that are themselves induced (Yap1, Cst6,
Cin5, Dot6, Xbp1, Pho2, Dal81 and War1; circled in green
in Figure 3A); these eight TFs collectively have the poten- 
tial to regulate the expression of 54% of the induced 
proteins (Figure 3A). Deletion mutants for five of the 
eight TFs (Yap1, Cst6, Cin5, Dot6 and Dal81; see red text
in Figure 3A) showed MMS sensitivity in the genomic
phenotyping assay (5), and Xbp1 is known to be a
stress-responsive TF. In general, 11% of all possible 
targets for these eight TFs are represented among the 
induced proteins. Although this is higher than the 6.5% 
that would be expected for random networks of the same 
size, it is still a relatively small fraction of all possible 
targets. Although this may be due in part to the fact 
that the GFP library represents only 70% of the yeast 
genome, it more likely indicates a tendency for cells to 
require combinations of TFs to modulate gene expression 
in response to environmental challenges, rather than 
allowing promiscuous non-specific upregulation of all 
possible targets for a single TF. Although a number of 
the induced proteins are targeted by more than one of 
the eight TFs (i.e. have an indegree >1, Figure 3A), 
these eight MMS-induced TFs by no means represent all 
of the TFs that are potentially capable of regulating the 
induced proteins. 

To obtain a comprehensive picture of all TFs that might 
govern the expression of induced proteins, irrespective of 
whether the TFs themselves were induced, we used the 
updated YEASTRACT database that documents all 
known major TF interactions in the yeast genome 
(25,26). This analysis identified 59 TFs, each of which po- 
tentially governs the expression of at least 5% of the 
induced proteins. Most induced genes were governed by 
more than one of the 59 TFs, with four as the median 
value of node in-degree, which represents the number 
of TFs governing the expression of a target protein 
(Figure 3B). 
Figure 2. Functional enrichment and protein-protein interactions of induced proteins. Proteins that showed fold change, f > 1.5 and passed the
statistical criteria were used. (A) The top 15 functional categories as determined by a GO analysis using Cytoscape are shown. Negative logarithms to
the base 10 of the P-values for the GO terms are plotted. Numbers in parentheses show the percentage of genes associated with a GO term, which are 
found to be induced. The dashed line shows the position of P = 0.05 in all figures. (B) Induced proteins are projected onto a yeast interactome. Blue 
lines denote physical protein-protein interactions, and red lines denote genetic interactions. The weakest edges have been removed to parse out 
isolated modules. Isolated single nodes are not shown. The text on the network is color coded to represent the broad cellular processes represented by 
the corresponding nodes. Gray nodes do not conform to these categories but are still not isolated single nodes. 



To test whether the induced protein nodes that can be 
governed by a large number of TFs serve distinct biolo- 
gical functions compared with more isolated nodes, we 
filtered out genes that can be targeted by four or fewer 
of the 59 putative TFs (representing proteins that are 
regulated by relatively few TFs) from genes that can be 
targeted by more than four TFs (representing proteins 
that are regulated by many TFs). GO analysis for these 
two sets of MMS-induced proteins revealed a clear differ- 
ence in functional enrichment, with P-values as low as or 
lower than those for the combined set of all induced 
proteins shown in Figure 2A (Figure 3C; Supplementary 
Table S4). Interestingly, induced proteins associated pri- 
marily with metabolic processes were found to be targeted 
by five or more TFs, whereas DNA damage response and 
chromatin remodeling proteins represent the more isolated 
nodes that are targeted by four or fewer TFs. In the full 
transcription network, the median value for indegree is 
also four, which is similar to the subset of MMS- 
induced proteins (Supplementary Figure S14). Not surprisingly,
when genes are partitioned in the same way
in the full network, DNA-related processes do not
dominate the more isolated nodes because all cellular 



functions are now represented. However, metabolic 
processes continue to be over-represented among the 
nodes with five or more upstream TFs, indicating that 
this may be a generic feature of the transcriptional 
network (Supplementary Figure S14).
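
The in-degree partitioning used in this section can be sketched as follows. This is a minimal illustration, not the YEASTRACT or Cytoscape workflow: the edge list, the TF and gene names and the helper functions are all hypothetical, and only the cutoff of four upstream TFs is taken from the text.

```python
from collections import defaultdict
from statistics import median


def in_degrees(tf_target_edges, targets):
    """Count the distinct TFs that regulate each target gene."""
    regulators = defaultdict(set)
    for tf, target in tf_target_edges:
        regulators[target].add(tf)
    return {gene: len(regulators.get(gene, ())) for gene in targets}


def partition_by_in_degree(degrees, cutoff=4):
    """Split genes into those with <= cutoff and > cutoff upstream TFs."""
    few = sorted(gene for gene, d in degrees.items() if d <= cutoff)
    many = sorted(gene for gene, d in degrees.items() if d > cutoff)
    return few, many


# Hypothetical TF -> target edges for four induced genes.
edges = [("TF1", "geneA"), ("TF2", "geneA"), ("TF1", "geneB")]
edges += [(f"TF{i}", "geneC") for i in range(1, 6)]  # five regulators
genes = ["geneA", "geneB", "geneC", "geneD"]

degrees = in_degrees(edges, genes)
few_tfs, many_tfs = partition_by_in_degree(degrees)
median_in_degree = median(degrees.values())
```

Each of the two resulting gene sets would then be submitted separately to a GO enrichment tool, which is the comparison summarized in Figure 3C.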

Enrichment of lipid biosynthesis and membrane trafficking 
processes among repressed proteins and networks 

In addition to induced genes, transcriptional profiling 
studies have revealed many genes whose transcripts are 
downregulated by MMS (6,7,9). At the protein level, 
one might expect fewer repression responders due to the 
longer half-lives of proteins compared with mRNAs, 
except in the case of targeted protein degradation. In 
other words, even for a gene that is transcriptionally 
silenced, its protein products may remain in the cell for 
some time, particularly under conditions of inhibited 
growth. In our screen, 174 repressed proteins were 
identified, even with a lower cut-off threshold of 25% reduction
compared with the 415 induced proteins that
showed >50% induction. P-values for GO-function
enrichment were generally larger for repressed versus 
induced proteins (Supplementary Table S3 and 
Figure 4A), and the network of repressed proteins had
lower connectivity than that of the induced proteins,
comparable with that of random networks of the same size
(Supplementary Figure S15). However, cellular membrane organization,
secretion, trafficking and peroxisome organization were 
significantly over-represented among the repressed 
proteins (Figure 4A and B). Also, the single repressed 
TF, Phd1, is known to control a number of sterol biosynthesis
genes that are important for lipid and
membrane biosynthesis (Figure 4C). Finally, several 
proteins involved in chromatin remodeling and regulation 
of the mitotic cell-cycle were also part of the network of 
repressed proteins (Figure 4B), even though none of these 
processes was significantly enriched in the GO analysis 
(Figure 4A). 

We used the YEASTRACT database to identify 52 TFs 
that can target the expression of 5% or more of the 



repressed proteins. Most of these (47/52) are found 
among the TFs that target genes for the induced 
proteins, presumably because some of these TFs (e.g. 
Yap1, Sfp1 and Ste12) have high numbers of targets in
the genome (Figure 4D). However, although we observed 
highly significant enrichment of the targets of these TFs 
for the induced proteins ($P = 1.8 \times 10^{-8}$ overall, Figure
4G), the enrichment was not nearly so high
($P = 9.4 \times 10^{-4}$ overall, Figure 4G) for repressed
proteins (Figure 4E-G). The number of targets for each 
TF in both lists was divided by the number of targets 
expected by random chance from the whole genome, to 
assess specific enrichment of the targets for a given TF in 
each list. This measures the normalized occurrence for 
each TF (thus a normalized occurrence of 1.5 indicates 
that 50% more targets are present in the induced or re- 
pressed list when compared with random lists of the same 
size that exhibit a mean normalized occurrence of 1). This
analysis revealed the targets of TFs for all three major
stress-responsive pathways in yeast [genes with heat
shock elements, stress response elements or AP-1 respon-
sive elements (6,9,44-46)] to be particularly enriched
among the induced proteins. TFs Yap1, Msn2, Msn4,
Rpn4, Hsf1 and Met4 all exhibit a high normalized occur-
rence among the induced proteins (Figure 4E, dotted
circle). There were other TFs that showed yet higher 
normalized occurrence, but the absolute number of 
targets for these TFs in the genome was low, and in 
effect, they targeted only several of the induced genes. 
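
The normalized occurrence described above can be illustrated with a minimal Python sketch. It assumes that the number of targets expected by chance equals the list size multiplied by the fraction of genome genes targeted by the TF. The function name, gene identifiers and counts are hypothetical and are not taken from the study.

```python
def normalized_occurrence(responders, tf_targets, genome):
    """Observed number of TF targets in a responder list divided by the
    number expected for a random list of the same size.

    Assumption: the chance expectation is len(list) * (targets / genome),
    i.e. the mean of sampling without replacement from the genome.
    """
    genome = set(genome)
    responders = set(responders) & genome
    targets = set(tf_targets) & genome
    if not responders or not targets:
        raise ValueError("responder list and target list must be non-empty")
    observed = len(responders & targets)
    expected = len(responders) * len(targets) / len(genome)
    return observed / expected


# Hypothetical toy genome: 1000 genes, 100 of which are targets of one TF.
genome = [f"gene{i}" for i in range(1000)]
tf_targets = genome[:100]
# Hypothetical responder list of 40 genes containing 6 targets of the TF.
responders = genome[:6] + genome[100:134]

value = normalized_occurrence(responders, tf_targets, genome)
print(value)
```

In this toy case, four targets are expected by chance in a list of 40 genes and six are observed, so the list holds 50% more targets than a random list of the same size.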

Signatures of translational regulation of induced proteins 

Cells use both transcription and translation to regulate 
gene expression. Transcriptional responses to DNA
damaging agents are well documented (6,9), and recent
studies have unveiled a translational component to such 
responses (12,13). The study by Begley et al. (2007) com- 
putationally identified a set of 425 genes with a skewed 
codon usage pattern such that their translation would be 
promoted by the Trm9 tRNA methyltransferase that cata- 
lyzes specific tRNA modifications that change codon-anti- 
codon affinity. Such modifications affect the efficiency of 
translation for a subset of transcripts rich in specific 
codons, especially under conditions of DNA damage 
(13). Ribosomal, metabolism and stress response genes 
were enriched in this group of 425 potential preferentially 
translated (PPT) genes. However, belonging to the PPT 
group does not ensure induction under conditions of 
damage, for transcriptional components can offset trans-
lational responses. Despite this, we find 57 PPT proteins
among the MMS-induced proteins, approximately twice
the number that is expected by random chance 
(Supplementary Figure S16A). These 57 proteins form a 
close network with several RP genes; GO analysis reveals 
enrichment of processes related to carbohydrate and tre- 
halose metabolism, ribosome biogenesis, oxidative stress 
and deoxyribonucleotide production (Supplementary 
Figure S16B and C). 
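
One common way to ask whether an overlap of this kind exceeds chance is a one-sided hypergeometric test. The text does not state which test was used, so the following Python sketch is illustrative only. The 425 PPT genes and the overlap of 57 come from the text; the population size and the size of the induced list are placeholders chosen for illustration, not values from the study.

```python
from math import comb


def hypergeom_upper_tail(k, population, marked, drawn):
    """P(X >= k) when `drawn` items are sampled without replacement from
    `population` items, of which `marked` carry the property of interest."""
    upper = min(marked, drawn)
    numerator = sum(
        comb(marked, i) * comb(population - marked, drawn - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, drawn)


PPT_GENES = 425    # from the text
OVERLAP = 57       # from the text
POPULATION = 4000  # placeholder: number of proteins screened (assumption)
INDUCED = 270      # placeholder: size of the induced list (assumption)

expected = INDUCED * PPT_GENES / POPULATION
fold = OVERLAP / expected
p_value = hypergeom_upper_tail(OVERLAP, POPULATION, PPT_GENES, INDUCED)
print(f"expected by chance: {expected:.1f}, fold enrichment: {fold:.2f}")
print(f"one-sided P = {p_value:.2e}")
```

Exact integer binomial coefficients are used so that the tail probability stays accurate even when it is very small.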

Nucleus and cytoplasm enriched proteins and networks 

The list of proteins that become enriched in the nucleus in 
response to MMS includes two categories: (i) nuclear 
proteins whose relative expression increases in response 
to MMS and (ii) proteins that translocate from the cyto-
plasm to the nucleus. Although not all induced nuclear
proteins are represented in this list, instances where the
NCR increase simply reflects the induction of a protein
were retained because they captured subtle expression in-
creases in the nucleus not represented in the total list of
induced proteins (Supplementary Table S1). For example,
Ubi4, yeast ubiquitin, is scored as both an induction and a 
nuclear enrichment responder, but Rxt2, a subunit of the 
histone deacetylase Rpd3 complex implicated in the acti- 
vation of DNA damage induced genes (47) is found only 
in the nuclear enrichment list. Remarkably, GO analysis 
of this relatively small list of proteins (133) produced 
extraordinarily high-significance values for functional en- 
richments, and perhaps not surprisingly, most of the 
nuclear-enriched functional categories pertained to DNA 
and RNAPII-related processes, with a notable absence of 
the metabolic, proteasomal and ribosomal processes seen 
in the total list of inductions. P-values for functional en-
richment (Supplementary Table S3 and Figure 5A) were
substantially lower even than those for the induced
proteins as a whole (Figure 2A), and the network of
nuclear-enriched proteins exhibited connectivity far
greater than what would be expected for similarly sized
random networks, reflecting the close functional inter-
actions among these proteins (Supplementary Figure
S17). For several GO terms associated with the nuclear-
enriched proteins, >25% of all associated proteins were
represented (Figure 5A). The physical and genetic protein 
interaction networks were dominated by components of 
chromatin remodeling, RNAP II-dependent transcription, 
plus mRNA and snoRNA processing proteins. 
Components of the cohesin complex were also present 
among the nuclear-enriched proteins. 

TF network analysis identified three nuclear-enriched
TFs (Ste12, Abf1, Dot6) that regulate a number of
proteins in the list of nuclear-enriched proteins.
Although other TFs such as Ixr1 previously implicated
in the DNA damage response (48,49) were not part of
this network, they were present in the list of nuclear en-
richments (Supplementary Table S1). These are instances
of subtle inductions that were only seen in the nucleus but
missed in the total list of induced proteins. Conversely,
two proteins that are a part of the induction list and
known to be nuclear-enriched upon damage (Ubc13,
Yap1) were not scored as nuclear-enriched in this study.
Although Yap1 with a KS-statistic value of 0.28 lay just
below the cutoff of 0.3 for nuclear enrichment, it should
be noted that in previous work, more than a 10-fold
higher MMS dose was used to induce its nuclear
translocation, compared with
the present study (50). Thus, nuclear enrichment of a
protein can be dose-dependent. 
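
To make the cutoff concrete, the following Python sketch computes a two-sample Kolmogorov-Smirnov (KS) statistic and compares it with the 0.3 cutoff quoted above. It assumes that NCR is a per-cell nuclear-to-cytoplasmic intensity ratio and that the statistic compares the single-cell distributions of untreated and MMS-treated populations; the actual procedure is defined in the Materials and Methods. The function names and the NCR values are hypothetical.

```python
from bisect import bisect_right

KS_CUTOFF = 0.3  # cutoff for nuclear enrichment quoted in the text


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest absolute difference between
    the empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The maximum difference is always attained at an observed data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def exceeds_cutoff(untreated_ncr, treated_ncr, cutoff=KS_CUTOFF):
    """Assumption: a protein is flagged when the statistic reaches the
    cutoff. The direction of the shift (into or out of the nucleus) is
    not checked here and would need separate handling."""
    return ks_statistic(untreated_ncr, treated_ncr) >= cutoff


# Hypothetical per-cell NCR values for one GFP-tagged strain.
untreated = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
treated = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4]

d = ks_statistic(untreated, treated)
print(d, exceeds_cutoff(untreated, treated))
```

Because the statistic depends only on how far the two distributions are separated, a protein whose translocation requires a higher dose can fall just below the cutoff at a moderate dose, as described for Yap1.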

Evidence from several distinct lines of work has
demonstrated connections between chromatin remodeling 
and the DNA damage response in eukaryotes 
(4,6,7,9,47,51-53). Our data (Figures 2B and 5B) also 
indicate extensive changes in proteins for the remodeling 
of chromatin and for nucleosome disassembly that are 
mobilized on exposure to MMS, presumably to allow 
repair machinery access to sites of DNA lesions. 

Surprisingly, few cytoplasmic enrichments (10 proteins)
were identified in our screen (Supplementary Table S1).
This may in part be because the cytoplasmic
volume is significantly larger than the nuclear volume,
rendering it difficult to detect translocation of low-ex-
pressed proteins. However, even amongst this small
number of proteins, interesting features stand out. For
example, we find that the Wtm1 and Wtm2 proteins trans-
locate out of the nucleus into the cytoplasm in response to
MMS, as does Rnr4 (Figure 5D and E). Wtm proteins are
involved in nuclear anchoring of the RNR small-subunits 
(54-57), with one study previously implicating the Wtm 
proteins in the control of RNR transcription (58). It 
appears that controlling the subcellular localization of 
the Wtm proteins may provide an additional mode of 
RNR regulation. 

RP response to MMS is dose dependent 

A surprising GO category for induced proteins was 
ribosome biogenesis and components of the ribosomal 
machinery because previous studies from several groups 
have shown that ribosomal genes are generally transcrip- 
tionally repressed under conditions of DNA damage 
(6,7,9,10). The regulation of ribosomal genes is thought 
to be primarily at the level of transcription in yeast (59), 
but almost 90% of all RP genes are also found in the list 
of PPT genes (13), suggesting cells use both transcription 
and translation to tune ribosome numbers (also see 
Supplementary Figure S16). A recent study similar to
ours (discussed in more detail later in the text) also 
identified this group among induced proteins (20). To in- 
vestigate this apparent discrepancy, we examined two such 
induced RPs (Rpl7a-GFP and Rps22a-GFP) using live- 
cell flow cytometry. We found that at moderate doses of 
MMS (0.02%, 2h), both proteins were induced, whereas 
at higher MMS dose (0.1%, 2h), there was a small but 
significant repression of the proteins (Figure 6A). The 2-h 
time-point was chosen because it was intermediate 
between the 3-h time-point at which the present screen is 
performed, and the 1-h time-point of the first transcrip- 
tional profiling study that revealed transcriptional repres- 
sion of the ribosomal genes at 0.1% MMS (7). This 
suggests, not surprisingly, that cellular response differs 
according to the extent of damage, presumably depending 
on whether the cell can repair damage and proceed 
through the cell-cycle, or whether growth halts com-
pletely. A similar pattern of induction was seen in
Tandem Affinity Purification (TAP)-tagged strains of
Rpl7a and Rps22a (Supplementary Figure S18). Because
ribosome biogenesis is intimately linked with cell growth, 
we investigated how cell growth characteristics differ at 
the two MMS doses. 

Using both liquid and agar assays, we found that at
0.02% MMS, cells proceed through the cell-cycle, albeit
slowly, whereas at 0.1% MMS, cell growth is abrogated
entirely (Figure 6B and C, Supplementary Figure S19).
Conditions of general cellular stress inhibit growth, and 
at the same time, ribosome biogenesis in a Target of 
Rapamycin (TOR)-dependent manner (17,59). The TOR 
pathway has also been shown to control DNA damage 
responses by controlling dNTP production (60), and the 
TOR pathway effectors Sch9 and Sfp1 are known to be
involved in both ribosome biogenesis and stress responses
(17,61,62). Sfp1 is a nuclear-localized TF that regulates
RP expression and translocates to the cytoplasm in
response to various stresses including DNA damage by
MMS (0.1%), thus turning off RP gene expression (17).
Sfp1 itself is also induced by MMS damage, which may
appear to conflict with the aim of shutting down ribosome
biogenesis (63). We therefore investigated the levels and
localization of Sfp1-GFP in response to moderate and
high doses of MMS. At both doses, Sfp1-GFP was
induced as assessed by live-cell flow cytometry (Figure
6D). Indeed, although Sfp1-GFP showed only a 27%
increase in the original screen, and was not scored as 
induced (only proteins with >50% increase in expression 
were considered to be responders), this small increase was 
significant (P = 0.003). Expression of Sfp1-GFP in the
absence of damage is low, making estimations of fold
change difficult. Our conservative estimates in the
global screen systematically underestimate fold change
when the fluorescence signal is close to
cellular autofluorescence (see 'Materials and Methods'
section), causing Sfp1-GFP to be absent from the list of
induced proteins. However, when observed with higher
resolution microscopy, at the moderate MMS dose
(0.02%), Sfp1-GFP was induced and clearly nuclear,
thus being available to upregulate the expression of RP
genes. At the higher dose (0.1% MMS), Sfp1-GFP, al-
though still present at induced levels, became cytoplasmic,
concomitant with the repression of ribosomal genes. This
provides a clear example where tuning both the abundance
and the nuclear-versus-cytoplasmic localization of a
protein can be used by cells to effect induction and repres-
sion of gene expression.

[Figure caption fragment: Overlays of GFP and phase images are presented. The scale bar is 5 µm.]

DISCUSSION 

Identification of repressed proteins 

The present study provides a comprehensive single-cell- 
level view of an orchestrated cellular response to damage 
induced by MMS. Previous studies using the yeast GFP 
fusion library to interrogate protein level changes in 
response to MMS, by imaging (20) or flow cytometry 
(21) used a single replicate of live cells. Here, fixation of 
cells was essential to ensure a specific and consistent 
exposure time to the damaging agent, as well as unam- 
biguous identification of the cell nucleus and cell-wall 
without expression of an additional fluorescent protein 
as a nuclear marker. In previous studies (20,21), no 
proteins met the cutoff set for downregulation, even
though many genes are known to be transcriptionally re-
pressed in response to MMS (6,7,9). As discussed, protein
level repressions may manifest as small changes at early 
time points after damage exposure, and hence replicate 
measurements may be necessary to identify them 
reliably. Similar doses and time points were used in all 
these studies: Lee et al. used 0.02% MMS for 4h; Tkach 
et al. used 0.03% MMS for 2h, whereas we use 0.02% 
MMS for 3h. Here, triplicate measurements enabled the 
identification of a number of proteins whose expressions 
are reduced on DNA damage. Processes of membrane- 
trafficking, lipid synthesis and peroxisome function were 
enriched among the repressed proteins. In breast cancer 
models, mutations in the critical tumor suppressor protein 
p53 have been associated with expression of sterol biosyn- 
thesis genes (64). In mouse liver, activation of the protein 
Nrf2 by oxidative stress has been shown to be associated 
with downregulation of lipid biosynthesis, to allow the 
scavenging of reactive oxygen species (ROS) by reduced 
nicotinamide adenine dinucleotide phosphate (NADPH) 
that functions both in lipid metabolism pathways and in
ROS scavenging (65,66). Peroxisomes may act as a source
of ROS that affect nucleic acids, proteins and lipids, and
they are kept under tight control in yeast (67); indeed,
peroxisome proliferation causes oxidative DNA damage
in rat livers and plays a role in hepatocarcinogenesis (68).
Although these findings span a number of distinct model 
systems, they provide plausible explanations for the genes 
repressed by MMS treatment, as seen in this study, and 
indicate that these may be general features of regulation 
under conditions of damage. 

Comparisons with other studies 

Despite the attenuation of GFP signals on fixation, the 
intensities measured in fixed cells in this work compared 
well with previous studies on live cells performed with 
both flow cytometry and imaging methods (16,20) 
(Supplementary Figures S11 and S12). Further, MMS-
induced protein groups are largely similar between this
study and that by Tkach et al. (20). In terms of protein 
localization, manual annotation of protein localization in 
the prior study allowed finer discrimination of subcellular 
localization into organelles, but at the cost of replicate 
measurements. In comparison, here, we measure broad 
nuclear versus cytoplasmic localization in a fully auto- 
mated manner and identify several highly nuclear- 
enriched DNA-related processes, as well as abundance 
changes that span a wide variety of cellular responses. 
Although we lose detail in the granularity of subcellular 
organelles, we detect GFP intensity automatically in the 
full nuclear and cytoplasmic masks, combined with 
greater statistical confidence from replicate measurements. 

The number of responders identified in this study is
substantially smaller than in the transcriptional profiling
studies. However, comparison with an early transcrip-
tional profiling study from our laboratory (7) reveals
that 62% of all the induced proteins are also transcrip- 
tionally induced in response to MMS. In contrast, only 
6.3% of the repressed proteins were found to be transcrip- 
tionally repressed. Indeed, although correlations between 
transcript and protein levels in unperturbed cell popula- 
tions are poor (69,70), it has recently been shown that 
under conditions of environmental stress, transcript in- 
duction correlates well with protein induction, but tran- 
script reduction produces negligible change in protein 
levels (71). Protein level repression under conditions of 
DNA damage presumably occurs due to targeted degrad- 
ation (72) instead of transcriptional repression. Several 
studies from our laboratory, including the present one, 
have indicated a key link between the proteasomal ma- 
chinery and response to MMS-induced damage (6,73). 

TF network analysis 

Extensive TF analyses identified Yap1 as a major induced
TF. Yap1 has many targets in the yeast genome, and its
role in stress responses is well-established (45,50,74). In
addition to causing direct alkylation damage, MMS can
also elicit an oxidative stress response in cells, and Yap1
was recently shown to be a major regulator of this
response (74). We also found specific enrichment of the
targets of several other major stress-responsive TFs
(Rpn4, Msn2, Msn4, Hsf1 and Met4) among the
induced proteins. Combinations of a large number of 
TFs were upstream of the observed metabolic response 
genes, whereas fewer TFs directly targeted genes for 
DNA-related processes. 

It should be borne in mind that the network analyses
performed here project responders onto static protein-
protein, genetic and TF interaction networks. It is
thought that genetic interactions may shift on DNA 
damage, whereas physical interactions remain largely un- 
changed (75). TF combinations affecting a gene may also 
shift on DNA damage (49), and the periodically updated 
YEASTRACT database documents most known TF 
interactions under various conditions from different 
sources, based on both bioinformatic and experimental 
data. However, under any one condition, only a subset 
of the targets is likely to be active for a given TF. 

DNA-related processes in nuclear-enriched proteins 

Interestingly, nuclear-enriched proteins are found to form 
a closely connected network, with a corresponding signifi- 
cant enrichment of DNA-related processes that include 
transcription, chromatin remodeling and DNA repair, 
and a notable absence of proteins of the metabolic, ribo- 
somal and proteolytic machinery. Thus, the list of nuclear- 
enriched proteins clearly segregates the central DNA level 
responses from the other cellular processes that together 
orchestrate the total cellular response to MMS insult. 

Differential response of RPs according to MMS dose 

Finally, differing responses for several RPs were observed 
depending on the MMS dose. These responses correlated 
with altered cell growth and the cytoplasmic translocation 
of the Sfp1 TF that results in shutting off RP gene expres-
sion. Although Sfp1 is a central player for RP gene ex-
pression, other factors also affect decisions to either stall
growth or to repair damage and proceed through the cell- 
cycle. Recently, proteins of the cohesin complex have been 
shown to be involved in RP gene regulation (76), in 
addition to direct DNA damage responses (39,40). The 
cohesin complex proteins are upregulated under the con- 
ditions of damage used here, which could in turn feed 
back onto RP expression. Future work will explore 
whether the upregulation of specific RPs is for the replace- 
ment of damaged ribosomal components (77,78) or to 
make functionally specialized stress-specific ribosomes 
(79,80) or perhaps both. 

Outlook 

Taken together, our work presents a global systems-level 
proteomic view of the cellular response to MMS damage. 
Transcription may represent the first level of regulation, 
but now, we reveal the protein responders in terms of 
levels and broad subcellular localization. Although 
focused studies investigate a few genes in isolation, intri- 
cate connections within the yeast proteome suggest that 
no response can be isolated from cascading effects within 
the network. Yet, despite the seemingly daunting complex- 
ity, the overall picture that emerges reveals a coordinated 
response in terms of DNA repair, chromatin remodeling,
proteolysis and cell growth, indicating that the system is
tuned to buffer a fairly large range of genotoxic 
challenges. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online, 
including [81-95].